Abstract

Motivated by recent advances in the spectral theory of auto-covariance matrices, we are led to revisit 
a reformulation of Markowitz’ mean-variance portfolio optimization approach in the time domain. 

In its simplest incarnation it applies to a single traded asset and allows one to find an optimal trading
strategy which, for a given return, is minimally exposed to market price fluctuations. The
model is initially investigated for a range of synthetic price processes, taken to be either second order
stationary, or to exhibit second order stationary increments. Attention is paid to the consequences of
estimating auto-covariance matrices from small finite samples, and auto-covariance matrix cleaning
strategies to mitigate these are investigated. Finally, we apply our framework to real-world
data.


I. INTRODUCTION 


When seeking an optimal strategy for capital allocation one can adopt a dynamic programming approach
that requires solving a Hamilton-Jacobi-Bellman or Bellman equation to find such a strategy. An
alternative approach, typically applied to single period problems, is mean-variance optimization, which forms
the basis of Markowitz' portfolio optimization theory. This approach has a rich history in economic
research and industrial practice [10-14]. One of the main reasons for its popularity is clearly its conceptual
simplicity, which helps in building an intuition about the nature of risk and its relation to an investment's return.


The last couple of decades have seen many physicists becoming interested in this very same question.
Key issues addressed in these studies concern the effects that sampling noise is likely to have on the measurement 
of correlations or covariances in large portfolios, the way in which such sampling noise is going to affect the 
solution of a subsequent mean-variance portfolio optimization problem, and the design of methods to mitigate 
against adverse effects of such sampling noise. 

The bedrock of most of these studies is the theory of random sample covariance matrices. Their spectral
theory was pioneered by Marcenko and Pastur in the 1960s. It has indeed been observed that, apart
from a number of large eigenvalues, the bulk of the spectrum of sample covariance matrices of asset returns
in various markets is very close to the form predicted by Marcenko and Pastur for sample covariance matrices of
i.i.d. random data. This type of comparison between market data and a null-model defined by
random data could then be used to devise theory-guided ways of distinguishing between information and noise
in market data, and thereby to devise methods to clean covariance matrices of asset returns for the purpose of
their subsequent use in portfolio optimization, with the effect of improving risk-return characteristics [15, 17-26].

The present study was triggered by the fact that the spectral theory of sample auto-covariance matrices, the
analogue of the Marcenko-Pastur theory in the time domain, has recently become available. This leads us to revisit the analogue
of Markowitz mean-variance optimization in the time domain, which in its simplest incarnation allows one to
find an optimal trading strategy for a single traded asset over a finite (discrete) time horizon. We investigate
this setup for a range of synthetic processes, taken to be either second order stationary, or to exhibit second
order stationary increments, and we systematically study the effects of sampling noise on optimal strategies and
on risk-return characteristics. Finally, we apply our framework to daily returns of the S&P500 index, and we
explore how results obtained for spectra of sample auto-covariance matrices could then be used
as a guide to clean sample auto-covariance matrices in a spirit analogous to that used for sample covariance
matrices in the context of portfolio optimization.

We note at the outset that we regard this as an exploratory study, and that we ignore economic factors such as 
discounting and agents’ asymmetric perceptions of gains and losses in the present paper. We expect that the 
primary area of application of our techniques would be in the high-frequency domain, as return auto-correlations
will be most prominent at short times. We note, however, that much of our analysis is about effects of sampling
noise on optimal trading strategies, which is relevant at all time scales, and thus also for weakly correlated data.

The remainder of this paper is organized as follows. In Sect. II we briefly describe Markowitz' approach to
portfolio optimization, and its translation into the time domain. In Sect. III we provide results for synthetic
processes, and numerically investigate the influence of sampling noise on optimal strategies and risk-return
profiles. In Sect. IV we look at optimal trading strategies for empirical data, using the S&P500 index as
an example, and we investigate the effect of auto-covariance matrix cleaning on risk-return profiles, based
on comparing auto-covariance spectra for the S&P500 and expected spectra for a process with uncorrelated
increments. Sect. V is devoted to a final overview, and an outlook on promising future research directions.


II. PORTFOLIO OPTIMISATION 

A. The Markowitz Set-Up 


In the simplest version of mean-variance portfolio optimization one considers a set of $N$ tradable assets
$i = 1, \dots, N$. It is usually assumed that these do not include complex financial instruments such as derivatives,
options and futures. An investor can take positions on these assets. We will use $\pi_i$ to denote the position on
asset $i$, using the convention that $\pi_i > 0$ represents a long position (buying the asset), whereas $\pi_i < 0$ represents
a short position (selling the asset). With $r_i$ denoting the (random) return on the $i$-th asset, the return on the entire
portfolio with positions $\pi = (\pi_1, \pi_2, \dots, \pi_N)'$ is given by

$$R(\pi) = \sum_{i=1}^{N} \pi_i r_i = \pi' r , \tag{1}$$

where $r = (r_1, r_2, \dots, r_N)'$ is used to denote the vector of random returns and the prime indicates a transpose.
The optimal portfolio according to Markowitz is the one that minimizes the variance of the portfolio return,

$$\mathrm{Var}[R(\pi)] = \sum_{i,j=1}^{N} \pi_i \pi_j \langle (r_i - \mu_i)(r_j - \mu_j) \rangle = \sum_{i,j=1}^{N} \pi_i \pi_j \Sigma_{ij} , \tag{2}$$

subject to the constraint of a given expected portfolio return $\mu_P$,

$$\mu_P = \langle R(\pi) \rangle = \sum_{i=1}^{N} \pi_i \langle r_i \rangle = \sum_{i=1}^{N} \pi_i \mu_i . \tag{3}$$

In (2), $\Sigma = (\Sigma_{ij})$ is the covariance matrix of asset returns.

To put a scale to the problem, one usually imposes the normalization constraint

$$\pi' \mathbf{1} = \sum_{i=1}^{N} \pi_i = 1 . \tag{4}$$

Here $\mathbf{1} = (1, 1, \dots, 1)'$ denotes the $N$-dimensional vector with all components equal to 1. The minimization
problem is solved using the method of Lagrange multipliers to take the constraints of expected return and
normalization into account, i.e. one looks for the stationary point of the Lagrangian

$$\mathcal{L} = \frac{1}{2} \pi' \Sigma \pi - \lambda_1 (\pi' \mathbf{1} - 1) - \lambda_2 (\pi' \mu - \mu_P) \tag{5}$$

w.r.t. variations of the $\pi_i$, $\lambda_1$ and $\lambda_2$. Elementary linear algebra then entails that the optimal portfolio $\pi^*$ takes
the form

$$\pi^* = \lambda_1 \Sigma^{-1} \mathbf{1} + \lambda_2 \Sigma^{-1} \mu , \tag{6}$$

with actual values of the Lagrange parameters $\lambda_1$ and $\lambda_2$ determined by the constraints.
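
As an illustration of how the constraints fix the two Lagrange parameters, the following is a minimal numerical sketch. The function name `mean_variance_weights` and the three-asset covariance matrix, mean returns and target return are hypothetical and chosen purely for illustration; the sketch assumes that the covariance matrix is positive definite and that the vector of mean returns is not proportional to the vector of ones.

```python
import numpy as np

def mean_variance_weights(sigma, mu, target):
    """Minimise pi' Sigma pi subject to pi'1 = 1 and pi'mu = target."""
    ones = np.ones(len(mu))
    # Columns of G are Sigma^{-1} 1 and Sigma^{-1} mu, the two vectors in Eq. (6).
    G = np.linalg.solve(sigma, np.column_stack([ones, mu]))
    # Inserting Eq. (6) into the two constraints gives a 2x2 linear system
    # for (lambda_1, lambda_2).
    A = np.array([[ones @ G[:, 0], ones @ G[:, 1]],
                  [mu @ G[:, 0], mu @ G[:, 1]]])
    lam = np.linalg.solve(A, np.array([1.0, target]))
    return G @ lam

# Hypothetical three-asset example (numbers are illustrative only).
sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
mu = np.array([0.02, 0.05, 0.08])
pi_star = mean_variance_weights(sigma, mu, target=0.05)
```

The returned weights satisfy both constraints by construction; any displacement that preserves the constraints can only increase the portfolio variance.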


B. Translation into the Time-Domain


The Markowitz portfolio optimization problem allows a fairly straightforward translation into the time-domain.
To formulate it, assume that $X = (X_t)_{t \in \mathbb{Z}}$ is the price process for a single traded asset. Let $\pi_t$ denote the
trading position that an investor takes on this asset at time $t$. As in the above we shall use the convention that
$\pi_t > 0$ represents a long position (buying the asset), whereas $\pi_t < 0$ represents a short position (selling the
asset).

The return of a trading strategy $\pi = (\pi_1, \pi_2, \dots, \pi_T)'$ over a finite time horizon of $T$ time steps for a realization
$x = (x_1, x_2, \dots, x_T)'$ of the price process can be written as

$$R_T(\pi | x_0) = \sum_{t=1}^{T} \pi_t (x_0 - x_t) . \tag{7}$$

In terms of these conventions the expected return $\mu_S$ of a trading strategy (conditioned on the initial price $x_0$)
is

$$\mu_S = \langle R_T(\pi | x_0) \rangle = \sum_{t=1}^{T} \pi_t (x_0 - \mu_t) = x_0 - \sum_{t=1}^{T} \pi_t \mu_t , \tag{8}$$

where we have restricted ourselves in the second step to normalized trading strategies satisfying $\pi' \mathbf{1} = 1$, and
where $\mu_t = \langle x_t \rangle$ denotes the expected price at time $t$.

It is worth remarking at the outset that $X$ could alternatively (and perhaps even more appropriately in the
present context) be thought of as the log-price process, in which case $R_T(\pi | x_0)$ would be the log-return of the
strategy $\pi$. For the sake of simplicity and definiteness we shall stick to the language of price processes and
returns in what follows.

An optimal trading strategy in the spirit of Markowitz would then be a strategy which minimizes the (conditional)
variance

$$\mathrm{Var}[R_T(\pi | x_0)] = \sum_{t,t'=1}^{T} \pi_t \pi_{t'} \langle (x_t - \mu_t)(x_{t'} - \mu_{t'}) \rangle = \sum_{t,t'=1}^{T} \pi_t \pi_{t'} \Sigma_{tt'} , \tag{9}$$

subject to the constraints of normalization $\pi' \mathbf{1} = 1$ and given mean return $\pi' \mu = x_0 - \mu_S$. In (9) the matrix
$\Sigma = (\Sigma_{tt'})$ now denotes the auto-covariance matrix of the price process.

The algebraic side of the problem of finding an optimal trading strategy is now formally fully equivalent to that
of finding an optimal portfolio, and the optimal strategy $\pi^*$ takes the form

$$\pi^* = \lambda_1 \Sigma^{-1} \mathbf{1} + \lambda_2 \Sigma^{-1} \mu , \tag{10}$$

with $\Sigma$ now the auto-covariance matrix of the price process rather than the covariance matrix of portfolio
returns. Actual values of the Lagrange parameters $\lambda_1$ and $\lambda_2$ are determined by the constraints as before.

It is well known, and indeed easily verified, that the globally optimal solution, which does not impose a restriction
concerning the mean return, is compactly given by

$$\pi_{\mathrm{GO}} = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}' \Sigma^{-1} \mathbf{1}} . \tag{11}$$
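
The step from the constraints to the Lagrange parameters can be written out explicitly. The following is a sketch of that elementary calculation; the abbreviations $s_{11}$, $s_{1\mu}$, $s_{\mu\mu}$ and $m$ are introduced here purely for illustration and are not part of the original notation. It assumes that $\mu$ is not proportional to $\mathbf{1}$, so that the determinant in the denominators is non-zero.

```latex
\begin{align*}
  s_{11} &= \mathbf{1}'\Sigma^{-1}\mathbf{1}, \qquad
  s_{1\mu} = \mathbf{1}'\Sigma^{-1}\mu, \qquad
  s_{\mu\mu} = \mu'\Sigma^{-1}\mu, \qquad
  m = x_0 - \mu_S, \\[4pt]
  \lambda_1 s_{11} + \lambda_2 s_{1\mu} &= 1, \qquad
  \lambda_1 s_{1\mu} + \lambda_2 s_{\mu\mu} = m, \\[4pt]
  \lambda_1 &= \frac{s_{\mu\mu} - m\, s_{1\mu}}{s_{11} s_{\mu\mu} - s_{1\mu}^2}, \qquad
  \lambda_2 = \frac{m\, s_{11} - s_{1\mu}}{s_{11} s_{\mu\mu} - s_{1\mu}^2}.
\end{align*}
```

The two linear equations follow from inserting the optimal strategy into the constraints $\pi'\mathbf{1} = 1$ and $\pi'\mu = x_0 - \mu_S$. Dropping the constraint on the mean return corresponds to $\lambda_2 = 0$; normalization alone then gives $\lambda_1 = 1/s_{11}$, which is the globally optimal solution.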


The main problem facing both portfolio optimization a la Markowitz, and the mean-variance approach to finding 
optimal trading strategies is that covariance matrices of portfolio returns or auto-covariance matrices of price 
processes of traded assets are not known, but need to be estimated from empirical market data. The effects of 
sampling noise in such estimation processes are well studied in the case of portfolio optimization. As mentioned 
in the introduction, various strategies to mitigate against such effects — typically guided by random matrix 
theory — have been investigated in the past. 

By contrast, the corresponding random matrix theory for sample auto-covariance matrices that might be invoked
for similar purposes for the problem of mean-variance formulations of optimal trading strategies has only recently
become available. We shall address the issue of sampling noise in empirical data and the use of spectral
theory for the purpose of guiding the choice of "cleaning"-strategies for auto-covariance matrices of market data
below in Sect. IV. Before that we investigate the effects of sampling noise for some synthetic processes where
comparison with known true auto-covariance matrices is possible.


III. RESULTS FOR SYNTHETIC PRICE PROCESSES


In this section we evaluate the theory developed in the previous section for synthetic price processes. We begin
by taking these processes to be either white noise processes or auto-regressive processes of order 1, and then move
on to look at the situation where price increments are modelled as white-noise and auto-regressive processes,
respectively. For the white noise and auto-regressive price processes, the true auto-covariance matrices are
known, and analytical expressions for optimal trading strategies can be given. We then look at the effects of
sampling noise, using estimates of auto-covariance matrices for various values of the ratio $\alpha = T/M$ of the
length $T$ of the risk horizon (and thus the matrix dimension) and the sample size $M$ used to determine these
estimates. The analytical expressions for the true auto-covariance matrices correspond to the $\alpha \to 0$ limit in
these results.


A. Synthetic Stationary Price Processes 


We first consider a price process with fluctuations around the trend $\delta X_t = X_t - \mu_t$ taken to be a Gaussian white
noise process, i.e. $\delta X_t \sim \mathcal{N}(0, \sigma^2)$. The true auto-covariance matrix in this case is proportional to the unit
matrix, i.e. $\Sigma = \sigma^2 I$, with $I$ the $T \times T$ unit matrix.

The globally optimal strategy (11) for a time horizon of length $T$ in this case is then readily found to be

$$\pi_{t,\mathrm{GO}} = \frac{1}{T} . \tag{12}$$

Thus, for a white noise process with variance $\sigma^2$ the optimal strategy $\pi_{\mathrm{GO}} = (1/T, 1/T, \dots, 1/T)'$ is uniform over
the time horizon $T$, and independent of the variance of the price process. The analogous result for a Markowitz
portfolio of uncorrelated assets is, of course, well known.

Let us next assume that price fluctuations around the trend are described by an AR(1) process, i.e. an
auto-regressive process of order 1 of the form

$$\delta X_t = a\, \delta X_{t-1} + \sqrt{1 - a^2}\, \xi_t , \tag{13}$$

in which $\xi_t \sim \mathcal{N}(0, 1)$; for simplicity, we have normalized the process to exhibit fluctuations of variance 1. The
parameter $a$ in (13) is required to satisfy $|a| < 1$ for fluctuations to be stationary. The auto-covariance function
of this process is known to be given by

$$\gamma(k) = \mathrm{Cov}[\delta X_t, \delta X_{t-k}] = a^{|k|} . \tag{14}$$

The auto-covariance matrix evaluated for a finite time horizon of length $T$ is thus a Toeplitz matrix of the form

$$\Sigma = \begin{pmatrix}
1 & a & a^2 & \cdots & a^{T-1} \\
a & 1 & a & \ddots & \vdots \\
a^2 & a & 1 & \ddots & a^2 \\
\vdots & \ddots & \ddots & \ddots & a \\
a^{T-1} & \cdots & a^2 & a & 1
\end{pmatrix} . \tag{15}$$

Its inverse is a tridiagonal matrix given by

$$\Sigma^{-1} = \frac{1}{1 - a^2} \begin{pmatrix}
1 & -a & 0 & \cdots & 0 \\
-a & 1 + a^2 & -a & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & -a & 1 + a^2 & -a \\
0 & \cdots & 0 & -a & 1
\end{pmatrix} . \tag{16}$$

The globally optimal strategy (11) for a time horizon of length $T$ in this case is then given by

$$\pi_{\mathrm{GO}} = \lambda_1 (1, 1 - a, \dots, 1 - a, 1)' , \tag{17}$$

with $\lambda_1 = [2 + (T - 2)(1 - a)]^{-1}$ fixed by the normalization constraint $\pi' \mathbf{1} = 1$. In this case the globally optimal
trading strategy turns out to be uniform apart from the two boundary terms. The white noise result is clearly
recovered as the $a \to 0$ limit of the present result for the AR(1) process, as it should be.
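
The closed-form AR(1) results can be checked numerically. The following is a minimal sketch, assuming a unit-variance AR(1) process; the helper names `ar1_autocov` and `global_optimum` are introduced here for illustration only. It builds the Toeplitz auto-covariance matrix, the tridiagonal matrix stated for its inverse, and the globally optimal strategy in both its numerical and its closed form.

```python
import numpy as np

def ar1_autocov(T, a):
    """Toeplitz auto-covariance matrix a^{|t - t'|} of a unit-variance AR(1) process."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return a ** lags

def global_optimum(sigma):
    """Minimum-variance strategy Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    w = np.linalg.solve(sigma, np.ones(sigma.shape[0]))
    return w / w.sum()

T, a = 10, 0.8
sigma = ar1_autocov(T, a)

# Tridiagonal form of the inverse: diagonal (1, 1 + a^2, ..., 1 + a^2, 1),
# off-diagonals -a, overall prefactor 1 / (1 - a^2).
inv = np.diag(np.r_[1.0, np.full(T - 2, 1 + a**2), 1.0])
inv -= a * (np.eye(T, k=1) + np.eye(T, k=-1))
inv /= 1 - a**2

# Closed-form strategy: uniform apart from the two boundary terms.
pi_closed = np.r_[1.0, np.full(T - 2, 1 - a), 1.0] / (2 + (T - 2) * (1 - a))
pi_numeric = global_optimum(sigma)
pi_white = global_optimum(ar1_autocov(T, 0.0))   # white-noise limit a -> 0
```

Comparing `pi_numeric` with `pi_closed`, and `pi_white` with the uniform strategy, provides a quick consistency check of the formulas.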

Solutions with constraints on the expected return can be given in closed form as well; they are simply obtained
by inserting (16) into (10), with Lagrange parameters obtained by solving a pair of linear constraint equations.
Details will of course depend on assumptions concerning the drift, and we refrain from writing them down
explicitly.

Fig. 1 shows optimal strategies for an AR(1) price process with parameter a = 0.8, both for the global optimum 
as well as for cases with non-zero mean returns imposed. As can be seen from the figure, increasing the expected 
strategy return from fis = 4.0 x 10“'* to /is = 1.0 x 10“^ changes the optimal strategy uni from one that is 
monotone decreasing over the risk-horizon to one which is monotone increasing, starting in fact with a
(short-)selling position at the initial time-step t = 1.




FIG. 1: Left panel: Globally optimal trading strategy for an AR(1) price process with a = 0.8 over a risk horizon of 
T = 10 time steps. Right panel: optimal strategies for a process with the same parameter a and a linear drift of the 
form gt ~ 10“'*t, imposing expected strategy returns of /is = 4 x 10“® (blue solid line) and /is = 1 x 10“® (solid orange 
line). 


B. Synthetic Price Processes with Stationary Increments 


The stationarity assumption for the price process used in the previous subsection is clearly unrealistic, and
there is an obvious need to go beyond it if the methods discussed in the present investigation are to be useful
in practice.

However, once the realm of stationarity is left, some structure is needed on a different level in order to make
operational sense of estimating auto-covariance functions and the corresponding auto-covariance matrices defined
over a finite time horizon. The structure we shall rely on here is based on the assumption that (fluctuations of) the
price process can be described as having stationary increments. If one adopts the reading that the processes
considered here are actually log-price processes, the assumption of stationarity of their increments is in fact a
popular assumption in much of Mathematical Finance.

In what follows we assume that the (log-)price process $X = (X_t)$ exhibits stationary increments, i.e. that

$$X_t = X_{t-1} + Y_t \tag{18}$$

with $Y_t = \langle Y_t \rangle + \delta Y_t = \mu_t - \mu_{t-1} + \delta Y_t$ with zero-mean fluctuations $\delta Y_t$. In terms of these conventions we can
write the return of a strategy $\pi = (\pi_t)$ for a given realization $x$ as

$$R_T(\pi | x_0) = \sum_{t=1}^{T} \pi_t (x_0 - x_t) = \sum_{t=1}^{T} \pi_t (x_0 - \mu_t) - \sum_{t=1}^{T} \pi_t \sum_{\tau=1}^{t} \delta y_\tau . \tag{19}$$

The expected return is given by the first contribution on the r.h.s., while the variance is

$$\mathrm{Var}[R_T(\pi)] = \sum_{t,t'=1}^{T} \pi_t \pi_{t'} \sum_{\tau=1}^{t} \sum_{\tau'=1}^{t'} \langle \delta y_\tau\, \delta y_{\tau'} \rangle . \tag{20}$$

This is of the same structure as (9), with the auto-covariance matrix $\Sigma = \Sigma^X = (\Sigma^X_{tt'})$ of the non-stationary
price process expressed in terms of the auto-covariance matrix $\Sigma^Y = (\Sigma^Y_{tt'})$ of the process of price increments
as

$$\Sigma^X_{tt'} = \sum_{\tau=1}^{t} \sum_{\tau'=1}^{t'} \langle \delta y_\tau\, \delta y_{\tau'} \rangle = \sum_{\tau=1}^{t} \sum_{\tau'=1}^{t'} \Sigma^Y_{\tau\tau'} . \tag{21}$$

This relation between the auto-covariance matrices of the process and the corresponding process of increments can
be compactly expressed in matrix form as

$$\Sigma^X = P \Sigma^Y P' , \tag{22}$$

where $P$ is a lower triangular constant matrix of ones,

$$P = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
1 & 1 & 0 & \cdots & 0 \\
1 & 1 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & 1 & 1 & \cdots & 1
\end{pmatrix} . \tag{23}$$

The mean-variance approach to strategy optimization then yields optimal trading strategies of the form (10),
with the auto-covariance matrix $\Sigma = \Sigma^X$ of the price process expressed in terms of the auto-covariance matrix
$\Sigma^Y$ of the process of stationary increments according to Eq. (22).


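Taking the price increments to be a white noise process $\delta Y_t \sim \mathcal{N}(0, \sigma^2)$, we have $\Sigma^Y_{tt'} = \sigma^2 \delta_{tt'}$, so
$(\Sigma^X)^{-1} = \sigma^{-2} (PP')^{-1}$, where $(PP')^{-1}$ is found to be of tridiagonal form,

$$(PP')^{-1} = \begin{pmatrix}
2 & -1 & 0 & 0 & \cdots & 0 \\
-1 & 2 & -1 & 0 & \cdots & 0 \\
0 & -1 & 2 & -1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -1 & 2 & -1 \\
0 & 0 & \cdots & 0 & -1 & 1
\end{pmatrix} . \tag{24}$$

The globally optimal strategy (11) in this case is then simply

$$\pi_{\mathrm{GO}} = (1, 0, 0, \dots, 0)' , \tag{25}$$

i.e., it consists of taking a single long position at the initial time step.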
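
The white-noise-increment case lends itself to a direct numerical check. The sketch below assumes unit-variance increments and a short horizon of $T = 6$ (both chosen here for illustration); it constructs the matrix of ones $P$, the resulting price auto-covariance matrix, the tridiagonal form of $(PP')^{-1}$, and the globally optimal strategy.

```python
import numpy as np

T = 6
P = np.tril(np.ones((T, T)))      # lower triangular matrix of ones
sigma_y = np.eye(T)               # unit-variance white-noise increments
sigma_x = P @ sigma_y @ P.T       # price auto-covariance; entries are min(t, t')

# Tridiagonal form of (PP')^{-1}: 2 on the diagonal except for a final 1,
# and -1 on the two neighbouring diagonals.
tri = 2 * np.eye(T) - np.eye(T, k=1) - np.eye(T, k=-1)
tri[-1, -1] = 1.0

# Globally optimal strategy Sigma^{-1} 1 / (1' Sigma^{-1} 1).
w = np.linalg.solve(sigma_x, np.ones(T))
pi_go = w / w.sum()
```

Since the row sums of the tridiagonal matrix vanish for all rows but the first, the strategy `pi_go` puts all weight on the first time step.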

If we assume an AR(1) process of the form Eq. (13) for the fluctuations of the price increments, i.e.

$$\delta Y_t = a\, \delta Y_{t-1} + \sqrt{1 - a^2}\, \xi_t , \tag{26}$$

then it is $\Sigma^Y$ which is given by Eq. (15); it turns out that $(\Sigma^X)^{-1} = (P \Sigma^Y P')^{-1}$, too, can be evaluated in closed
form, giving

$$(\Sigma^X)^{-1} = \frac{1}{1 - a^2} \begin{pmatrix}
C & -A^2 & a & 0 & \cdots & & 0 \\
-A^2 & 2B & -A^2 & a & 0 & & \vdots \\
a & -A^2 & 2B & -A^2 & a & \ddots & \\
0 & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\
 & \ddots & a & -A^2 & 2B & -A^2 & a \\
\vdots & & 0 & a & -A^2 & C & -A \\
0 & \cdots & & 0 & a & -A & 1
\end{pmatrix} ,$$

in which we use the abbreviations $A = 1 + a$, $B = 1 + a + a^2$ and $C = 1 + A^2$.

In this case the globally optimal strategy (11) is of the form

$$\pi_{\mathrm{GO}} = (1 + a, -a, 0, \dots, 0)' , \tag{27}$$

i.e. it consists of taking a single long position at the first time-step, which is then partially offset by a short
position at the second time step if $a > 0$, whereas it is followed by a further long position if successive price
increments are anti-correlated ($a < 0$). Note that the solution for white noise increments is correctly recovered
as the $a \to 0$ limit of the AR(1) results.
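
The pentadiagonal structure and the resulting strategy can be verified numerically. The sketch below assumes unit-variance AR(1) increments with $|a| < 1$ and includes the overall factor $1/(1 - a^2)$ that this normalization implies; the function name `ar1_increment_case` and the parameter values are chosen here for illustration.

```python
import numpy as np

def ar1_increment_case(T, a):
    """Return Sigma^X, the closed-form pentadiagonal inverse, and the GO strategy."""
    idx = np.arange(T)
    sigma_y = a ** np.abs(np.subtract.outer(idx, idx))   # AR(1) increments
    P = np.tril(np.ones((T, T)))
    sigma_x = P @ sigma_y @ P.T

    A, B = 1 + a, 1 + a + a**2
    C = 1 + A**2
    inv = (2 * B * np.eye(T)
           - A**2 * (np.eye(T, k=1) + np.eye(T, k=-1))
           + a * (np.eye(T, k=2) + np.eye(T, k=-2)))
    # Boundary corrections in the first row and the last two rows and columns.
    inv[0, 0] = C
    inv[T - 2, T - 2] = C
    inv[T - 1, T - 1] = 1.0
    inv[T - 2, T - 1] = inv[T - 1, T - 2] = -A
    inv /= 1 - a**2

    w = np.linalg.solve(sigma_x, np.ones(T))
    return sigma_x, inv, w / w.sum()

sigma_x, inv, pi_go = ar1_increment_case(T=8, a=0.5)
_, _, pi_go_anti = ar1_increment_case(T=8, a=-0.5)   # anti-correlated increments
```

For $a = 0.5$ the strategy is long at the first step and short at the second; for $a = -0.5$ both positions are long.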

Once more, solutions with constraints on expected returns can be given in closed form; in analogy to the
procedure described for the case of stationary price processes, they are obtained by inserting the inverse
auto-covariance matrix given above into (10), with Lagrange parameters obtained by solving a pair of linear
constraint equations.

We find, and shall demonstrate below, that the procedure predicts non-trivial changes of strategy as constraints
on expected returns are varied. Once more, details will depend on assumptions concerning the drift, and we
refrain from producing explicit equations here. We will report our analytical results alongside numerical results
which take into account sampling errors arising from finite sample fluctuations of estimated auto-covariance
matrices.


C. The Effects of Sampling Noise 


Having analytical results for synthetic price processes available allows one to estimate the effects of sampling
noise on optimal strategies and on risk-return profiles. In practice, the analytic structure of an underlying price
process will not be known, and auto-covariance matrices will have to be estimated on the basis of finite samples,
i.e. the design of optimal strategies will have to be based on sample auto-covariance matrices $\hat\Sigma$.

For a stationary price process, samples taken along a realization of the process can be taken to define the
elements of $\hat\Sigma$ via

$$\hat\Sigma_{tt'} = \frac{1}{M - 1} \sum_{u=1}^{M} \delta x_{t+u}\, \delta x_{t'+u} . \tag{28}$$

This procedure introduces sampling noise; estimated auto-covariance matrix elements $\hat\Sigma_{tt'}$ will exhibit
fluctuations about their corresponding true counterparts $\Sigma_{tt'}$. When assessing the effects of sampling
noise via the influence on spectra, one expects the relevant parameter to be the aspect ratio $\alpha = T/M$,
i.e. the ratio of the number of time-lags considered and the sample-size used to estimate matrix elements. We
shall use this parameter in what follows to parametrize the influence of sampling noise, with the $\alpha \to 0$ limit
corresponding to the situation without sampling noise, i.e. with true asymptotic auto-covariances known.
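
To make the estimator and the role of the aspect ratio concrete, the following is a minimal simulation sketch. It assumes a unit-variance AR(1) process with $a = 0.8$, a horizon $T = 10$ and $M = 20$ (so $\alpha = 0.5$); the helper names, the number of repetitions (500) and the random seed are choices made here for illustration and are not taken from the text. It compares the in-sample risk of the globally optimal strategy, computed from estimated matrices, with the risk obtained from the true matrix. `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np

def simulate_ar1(n, a, rng):
    """Unit-variance AR(1) fluctuations, started in the stationary state."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    noise = np.sqrt(1 - a**2) * rng.standard_normal(n)
    for t in range(1, n):
        x[t] = a * x[t - 1] + noise[t]
    return x

def sample_autocov(dx, T, M):
    """Average of lagged products over M shifts along a single realization."""
    # Row u holds (dx[u], ..., dx[u + T - 1]); needs len(dx) >= T + M - 1.
    windows = np.lib.stride_tricks.sliding_window_view(dx, T)[:M]
    return windows.T @ windows / (M - 1)

def min_risk(sigma):
    """Variance of the globally optimal strategy, 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(sigma.shape[0])
    return 1.0 / (ones @ np.linalg.solve(sigma, ones))

rng = np.random.default_rng(0)
T, M, a = 10, 20, 0.8                      # aspect ratio alpha = T / M = 0.5
lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
true_risk = min_risk(a ** lags)            # risk for the true AR(1) matrix
in_sample = np.mean([
    min_risk(sample_autocov(simulate_ar1(T + M, a, rng), T, M))
    for _ in range(500)
])
```

In this sketch the average in-sample risk lies below the true minimal risk, illustrating how optimizing on a noisy estimate makes a strategy look safer than it is.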

If the price process is not stationary, but has stationary increments, one can use Eqs. (22) and (23) to express
the auto-covariance matrix $\Sigma^X$ of the price process in terms of the auto-covariance matrix $\Sigma^Y$ of the process of



FIG. 2: Risk-return profile for an AR(1) price process with the same parameters as in Fig. 4 for various levels of
sampling noise parameterized by $\alpha$. Results are obtained by averaging over 10^ samples as in Fig. 3. Note in particular
that sampling noise leads to an under-estimation of risk. The two horizontal dashed lines indicate two values of the
target return for which optimal trading strategies are reported in Fig. 4 below.


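price increments. For the latter it is legitimate to use an estimator by sampling along a realization, so one can
define $\hat\Sigma^X$ via

$$\hat\Sigma^Y_{tt'} = \frac{1}{M - 1} \sum_{u=1}^{M} \delta y_{t+u}\, \delta y_{t'+u} \tag{29}$$

and

$$\hat\Sigma^X = P \hat\Sigma^Y P' . \tag{30}$$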
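
The two-step estimate for processes with stationary increments can be sketched as follows. The function name `price_autocov_from_increments` and the white-noise test data are hypothetical; the input is assumed to contain increment fluctuations with any drift already removed. The sketch also shows that multiplying by the matrix of ones on both sides is equivalent to taking cumulative sums over rows and columns, which avoids forming that matrix explicitly.

```python
import numpy as np

def price_autocov_from_increments(dy, T, M):
    """Estimate the increment auto-covariance, then map it to the price process."""
    # Row u holds (dy[u], ..., dy[u + T - 1]); needs len(dy) >= T + M - 1.
    windows = np.lib.stride_tricks.sliding_window_view(dy, T)[:M]
    sigma_y_hat = windows.T @ windows / (M - 1)
    P = np.tril(np.ones((T, T)))
    return P @ sigma_y_hat @ P.T, sigma_y_hat

rng = np.random.default_rng(1)
T, M = 5, 50
dy = rng.standard_normal(T + M)     # hypothetical increment fluctuations
sigma_x_hat, sigma_y_hat = price_autocov_from_increments(dy, T, M)

# Equivalent formulation: cumulative sums over rows, then over columns.
via_cumsum = np.cumsum(np.cumsum(sigma_y_hat, axis=0), axis=1)
```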

In Fig. 2 we show the risk-return profile for the case of an AR(1) price process for various aspect ratios $\alpha$,
ranging from a = 0.5 down to a = 10“"'^, with the noise-free case a = 0 also included. Note that sampling noise 
leads to a systematic underestimation of risk, though results quickly approach the noise-free limit as a becomes 
small. 

Fig. 3 exhibits the weights of the globally optimal (minimum risk) trading strategy for this process, while
Fig. 4 gives weights of optimal trading strategies for two different values of the target return (indicated by
the two horizontal dashed lines in Fig. 2). In this case we assume a small linear drift $\mu_t$ of the underlying
price process, as specified in the caption of Fig. 4. It is noticeable that an increase in the required target return leads to a qualitative change of the
optimal strategy, with the larger target return requiring an initial short position at the beginning of the
trading period.

Turning to the situation where we use an auto-regressive process to describe the statistics of price increments,
we see from a comparison of Figs. 5 and 2 that risk levels are significantly larger compared to the situation
where the same underlying process describes the fluctuations of the price process itself.

This concludes our collection of results for synthetic price processes, where the underlying true auto-covariances 
are known. We now turn to applying the framework to empirical data, where this is not the case. 


IV. EMPIRICAL DATA 


In what follows we apply our framework to empirical data, using daily adjusted close data of the S&P500, 
spanning the period 03 Jan 1950 to 20 Apr 2015. 












FIG. 3: Globally optimal trading strategies for an AR(1) price process with $a = 0.8$ over a risk horizon of $T = 10$ time
steps, using estimated auto-covariance matrices. Data are shown for various values of the ratio $\alpha = T/M$ of risk horizon
and sample size $M$ used to estimate auto-covariances according to Eq. (28); optimal strategies (with solid lines as guides
to the eye) are obtained by averaging over 10^ samples. Standard deviations are also shown; they rapidly decrease with
$\alpha$. Results obtained for the true auto-covariance function (the $\alpha \to 0$ limit) are included for comparison. Note that
average strategies obtained for finite samples are very close to the $\alpha = 0$ results.




FIG. 4: Left panel: Optimal strategies for an AR(1) process with a — 0.8 and a linear drift of the form fit = W~'^t as in 
Fig. 1, with imposed expected strategy return of /rs = 4 x 10~®. Shown are average trading strategies for various levels 
of sampling noise parameterized by non-zero a, obtained by averaging over 10^ samples. Average results are close to 
those obtained using true asymptotic auto-covariance matrices in the a —^ 0-limit, which are included for comparison. 
Right panel: optimal trading strategies for an AR(1) process with the same parameters as in the left panel, but now 
with /rs = 1 X 10“®. 


This is perhaps the point to note that we are not advocating that using the variance of trading strategy returns
constitutes the best way of capturing risk in real market data. Indeed, given that market returns are known
to have fat-tailed distributions, variance can at best be regarded as a proxy for risk. However, our primary
goal here is not to explore a wider family of possible risk measures, but rather to define a reformulation of the
popular mean-variance optimization strategy in the time domain, and to begin investigating its properties.


FIG. 5: Left: Optimal strategies for a setup where the fluctuations of the price-increments are described by an AR(1)
process with a = 0.8; a linear drift of the form fit = 10~^t is assumed for the price process, and an expected strategy 
return of = 1 x 10“® is imposed. Shown are average trading strategies (solid lines) obtained by averaging over 10^ 
samples for various levels of sampling noise parameterized by non-zero $\alpha$. Average results are close to those obtained
using true asymptotic auto-covariance matrices in the $\alpha \to 0$ limit, which are included for comparison. Right: risk-return
profile for this setup, with the horizontal dashed line indicating the expected strategy return imposed in the data of
the left panel. The right panel should be compared with Fig. 2, which exhibits the risk-return profile for an AR(1) price
process.


A. The Spectrum of the S&P500 Auto-Correlation Matrix


Before turning to the evaluation of optimal trading strategies and risk-return profiles we shall have a look at
the spectrum of the auto-covariance matrices of the data, taking time windows of $T = 50$ and sample sizes
of $M = 100$, hence $\alpha = 0.5$. Auto-covariance matrices of the price process are obtained as described in Sect.
III C, by first evaluating auto-covariances of the return process, assuming stationarity across individual sample
windows. In order to obtain meaningful statistics across the entire data set, we transform the return series in
each time window to exhibit unit-variance increments, and then obtain auto-covariances of the thus normalized
price process using the transformation Eq. (30).





FIG. 6: Spectrum of the sample auto-covariance matrix of the S&P500, normalized as described in the main text, using
$T = 50$ time lags and an aspect ratio $\alpha = 0.5$, i.e. samples of size $M = 100$ to define the auto-covariances (red full line).
Also shown is a comparison with the spectrum of an auto-covariance matrix for a price process with independent unit
variance increments (green dashed line). The two are remarkably close.

As can be seen in Fig. 6, where we plot the density of logarithms of eigenvalues, the spectrum is very broad,
spanning several orders of magnitude. For comparison we include the spectrum for a process with independent
unit variance increments using the same values of $T$ and $M$, and we notice that the two are remarkably close.
This is not completely unanticipated, as it is one of the widely reported 'stylized facts' in the field that return
series have very short correlation times. We will use this type of spectral comparison below to inform the
auto-covariance matrix cleaning strategy that we will use for the purpose of noise reduction.


B. Optimal Trading Strategies and Auto-Covariance Matrix Cleaning 


In Fig. 7 we report the risk-return characteristics for optimal trading strategies on the S&P500, using sample
auto-covariance matrices of $T = 50$ time lags, and sample size $M = 100$ as in Fig. 6. We report results obtained
for auto-covariance matrices as measured via Eqs. (29) and (30), and compare them with results obtained by
applying a cleaning strategy to these, which we shall describe below. We use realized returns defined by linear
trends in each data window to compute risk-return profiles, and distinguish in-sample risk, true risk
and out-of-sample risk, taking the average auto-correlation matrix across the entire time series as
a proxy for the true auto-correlation. Note that the reduction of risk that can be obtained through cleaning is
substantial.



FIG. 7: Risk-return profile of optimal trading strategies on the S&P500 data. Left: risk-return profile obtained from
measured auto-covariance matrices. Right: risk-return profile obtained using cleaned versions of auto-covariance matrices.
Horizontal dashed lines denote target strategy returns $\mu_S$ for which optimal strategies are reported in Fig. 8 below.




FIG. 8: Left: globally optimal strategy for the S&P500, showing results both before and after cleaning. Right: optimal
strategies for the two target returns of $\mu_S = 0.01$ and $0.06$ indicated in Fig. 7 above.


Fig. 8 exhibits optimal trading strategies for the S&P500, showing both the minimal risk solution and
risk-optimal solutions for two different non-zero target strategy returns. Apart from the effect of reducing risk, we
find that the effect of cleaning is also to create strategies that are smoother than those obtained without cleaning.

Let us finally turn to the cleaning strategy that is used to obtain the data described above. In the context
of covariance matrices of financial data, strong similarities were observed between empirical correlation matrix
spectra and the Marcenko-Pastur law expected for high-dimensional uncorrelated data. One of the cleaning
strategies that has been suggested on the basis of such similarities is referred to as 'clipping'. It analyses
correlation matrices by performing a spectral decomposition, and regards the bulk of a sample correlation
matrix spectrum, which resembles the Marcenko-Pastur law, as noise. It then transforms correlation matrices
by keeping large eigenvalues outside the bulk, and replacing those in the bulk by their average, thereby avoiding
small eigenvalues in the transformed matrix.
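
The clipping procedure just described can be sketched in a few lines. The function name `clip_correlation_matrix`, the single-factor test data and the seed are hypothetical; the sketch assumes a correlation matrix of unit-variance variables, for which the upper edge of the Marcenko-Pastur bulk is $(1 + \sqrt{q})^2$ with $q$ the ratio of matrix dimension to sample size.

```python
import numpy as np

def clip_correlation_matrix(corr, q):
    """Replace eigenvalues inside the Marcenko-Pastur bulk by their average."""
    lam_plus = (1 + np.sqrt(q)) ** 2       # upper bulk edge for unit variance
    vals, vecs = np.linalg.eigh(corr)
    bulk = vals <= lam_plus
    cleaned = vals.copy()
    if bulk.any():
        cleaned[bulk] = vals[bulk].mean()  # flattening the bulk keeps the trace
    # Reassemble the matrix from the original eigenvectors.
    return (vecs * cleaned) @ vecs.T

# Hypothetical data: N variables sharing one common factor, M observations.
rng = np.random.default_rng(2)
N, M = 20, 60
factor = rng.standard_normal((M, 1))
data = 0.7 * factor + rng.standard_normal((M, N))
corr = np.corrcoef(data, rowvar=False)
cleaned = clip_correlation_matrix(corr, q=N / M)
```

Averaging the bulk preserves the trace while lifting the smallest eigenvalues, which is what makes the cleaned matrix better conditioned for inversion. The diagonal of the cleaned matrix is in general no longer exactly one.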

In the present case, the phenomenology is rather different; there are no eigenvalues of the (normalized) sample 
auto-covariance matrices which can be regarded as lying significantly outside the bulk of the spectrum predicted 
for uncorrelated increments. So there would be no clear guidance coming from random matrix theory that could 
form the basis of a clipping-type procedure. 


We therefore decided to apply a 'shrinkage' procedure to our data. To the best of our knowledge this procedure
was first proposed by Stein [32], and has recently found renewed interest in the Mathematical Statistics
and Econophysics [33] communities.


Based on the observation reported in Fig. 6 that the (normalized) auto-covariance spectra of the S&P500 and
of a synthetic process with independent increments are indeed rather similar, we apply the shrinkage procedure
to the sample auto-covariance matrices $\hat\Sigma^Y$ of the S&P500 increments, shrinking them towards a target matrix
$D$ given by the diagonal matrix of variances of the increments (which would indeed describe a process of
independent increments), i.e. towards $D = \mathrm{diag}(\hat\Sigma^Y_{tt})$, using the substitution rule




(31) 


and transforming the shrunk matrix thus obtained to define the cleaned estimate of E^ using the transformation
Eq. (1221). The proper value for the parameter S in this procedure is determined from the data as described in
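
A linear shrinkage step towards a diagonal target can be sketched as follows. This is a minimal sketch that assumes the common convex-combination form of shrinkage; the exact substitution rule (31), the symbol used for its parameter, and the direction in which that parameter interpolates may differ. The names `shrink_to_diagonal` and `weight` are hypothetical, and the data-driven choice of the parameter is not covered.

```python
import numpy as np

def shrink_to_diagonal(sample_cov, weight):
    """Shrink a sample auto-covariance matrix towards its own diagonal.

    Assumed form (not taken from the paper):
        weight * D + (1 - weight) * sample_cov,  with D = diag(sample_cov).
    weight = 0 returns the sample matrix, weight = 1 returns the target D.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    sample_cov = np.asarray(sample_cov, dtype=float)
    # Target: diagonal matrix of sample variances, i.e. the auto-covariance
    # matrix of a process with independent increments.
    target = np.diag(np.diag(sample_cov))
    return weight * target + (1.0 - weight) * sample_cov


# Toy 2x2 example: the off-diagonal entries are halved, the diagonal is kept.
example = shrink_to_diagonal([[2.0, 1.0], [1.0, 3.0]], 0.5)
```

In this form the variances on the diagonal are left untouched, and only the estimated correlations between increments at different times are damped.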


V. SUMMARY AND DISCUSSION 


To summarize, in the present paper we have revisited a reformulation of Markowitz’ mean-variance optimization
in the time domain to obtain optimal trading strategies for a single traded asset over a finite discrete time
horizon. Using simple linear algebra, one obtains such optimal trading strategies as sequences of buy, hold, and
sell instructions for that asset, which minimize the market fluctuations of the return generated by this sequence
of instructions over a given time horizon, subject to suitable constraints. The procedure requires the
auto-covariance matrix of the price process (and estimates for expected prices) during the risk horizon as input.

We investigated this problem for a number of synthetic price processes, taken to be either second order stationary
or to be described by second order stationary increments. Analytic expressions are given for the cases where the
price and the return processes are described by i.i.d. or by auto-regressive fluctuations.

We compare analytic solutions with numerical results for situations where auto-covariance matrices have to
be estimated from finite samples, which is the situation typically encountered in practice. For the synthetic
processes for which true auto-covariance matrices are known, the effects of sampling noise on optimal strategies
and on risk-return profiles can thus be quantitatively assessed. We find that in general sampling noise leads to
an underestimation of risk, but that asymptotic results are well approximated when samples used to estimate
auto-covariance matrices are sufficiently large. A ratio α = T/M < 0.1, i.e. a sample size of more than ten
times the length of the risk horizon, appears to be desirable from this point of view.
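
The underestimation of risk by sampling noise can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not taken from the paper: the true covariance matrix is the T x T identity, the only constraint is that the weights sum to one, and the function name `min_variance_risks` is hypothetical. It compares the risk computed with the estimated matrix (in-sample) against the risk the same weights carry under the true matrix.

```python
import numpy as np

def min_variance_risks(T, M, rng):
    """Return (in_sample_risk, true_risk) of sample-optimal weights.

    T : dimension of the covariance matrix (length of the risk horizon)
    M : number of samples used to estimate it
    Assumes i.i.d. unit-variance Gaussian data, so the true matrix is the
    identity and the true minimal risk is 1 / T.
    """
    x = rng.standard_normal((M, T))
    sample_cov = x.T @ x / M          # estimate with known zero mean

    # Minimum-variance weights under the single constraint sum(w) = 1.
    w = np.linalg.solve(sample_cov, np.ones(T))
    w /= w.sum()

    in_sample_risk = w @ sample_cov @ w   # risk as seen by the estimator
    true_risk = w @ w                     # risk under the true (identity) matrix
    return in_sample_risk, true_risk


rng = np.random.default_rng(1)
# A deliberately small sample (T / M = 0.5) to make the effect pronounced.
in_sample, true = min_variance_risks(T=40, M=80, rng=rng)
```

For this toy setting the risk under the true matrix can never fall below 1 / T, whereas the in-sample value is typically well below it when T / M is not small; repeating the experiment with T / M = 0.1 shows a much smaller gap.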

From the financial point of view on the other hand, it is always desirable to use time series as short as possible 
for estimation, to avoid letting (possibly) outdated data influence current trading strategies. Small samples, 
however, increase the effects of sampling noise, and it is for this reason that cleaning strategies have an important 
role to play. Looking at the S&P500 data, we found that (normalized) auto-covariance spectra closely resemble 
those one would expect for price processes with independent increments, and it is this observation that motivates 
our choice of target matrix within a shrinkage cleaning strategy. 
We observe that auto-covariance matrix cleaning gives rise to smoother trading strategies, and that it also leads
to a reduction of risk in risk-return profiles.

A natural generalization of the present work would deal with a multi-period multi-asset version of a
mean-variance formulation of optimal trading strategies. While some work has been done in this direction in the
past (see, e.g., [1^ and references therein), the solution presented in remains somewhat formal, and restricted to
the case without correlations in time. We are not aware of an investigation of the effects of sampling noise in
the multi-period multi-asset case. Indeed, the spectral theory for that case, which would be useful to motivate
and design cleaning strategies, has not been developed as of now.

Another direction that could be pursued is to include higher moments of strategy-return distributions in
measures of risk, in order to better capture risk in the presence of fat-tailed return distributions. The
translation into the time domain, as advocated in the present paper, would in general involve k-point correlations
of returns in time (where k > 3). Assessing sampling noise in such a situation would then clearly transcend the
realm of random matrix theory.